The OLAC Metadata Set and Controlled Vocabularies
Abstract
As language data and associated technologies proliferate and as the language resources community rapidly expands, it has become difficult to locate and reuse existing resources. Are there any lexical resources for such-and-such a language? What tool can work with transcripts in this particular format? What is a good format to use for linguistic data of this type? Questions like these dominate many mailing lists, since web search engines are an unreliable way to find language resources. This paper describes a new digital infrastructure for language resource discovery, based on the Open Archives Initiative, and called OLAC – the Open Language Archives Community. The OLAC Metadata Set and the associated controlled vocabularies facilitate consistent description and focussed searching. We report progress on the metadata set and controlled vocabularies, describing current issues and soliciting input from the language resources community.
Related resources
The Open Language Archives Community and Asian Language Resources
The Open Language Archives Community (OLAC) is a new project to build a worldwide system of federated language archives based on the Open Archives Initiative and the Dublin Core Metadata Initiative. This paper aims to disseminate the OLAC vision to the language resources community in Asia, and to show language technologists and linguists how they can document their tools and data in such a way ...
Vocabulary Conversion: Performance with Controlled and Uncontrolled Terms and Tags Technical
Controlled and uncontrolled indexing terminology and metadata may be converted from one to another. Decision criteria are developed that can be used to determine which terms should be assigned when converting vocabularies. Methods are developed for computing the parameters of these systems, as well as means for estimating the parameters when given limited information. These conversion technique...
Find and Combine Vocabularies to Design Metadata Application Profiles using Schema Registries and LOD Resources
A metadata schema which defines constraints about metadata records is a fundamental resource for metadata interoperability. Building interoperable metadata schemas has been a main topic of the Dublin Core since its early days. It is important to make use of existing metadata schemas to develop a new schema in order to minimize newly defined metadata vocabularies, which is how DCMI has developed...
Advanced Search Technologies for Unfamiliar Metadata
Searching of databases (textual or numeric) is likely to be effective and efficient only if the user is familiar with the classification, categorizing, and indexing schemes (metadata vocabularies) being searched. Therefore, it is obviously beneficial to provide a bridge between the user’s ordinary language and the metadata vocabularies of the unfamiliar database in order to compensate for abbre...
Extending Dublin Core Metadata to Support the Description and Discovery of Language Resources
As language data and associated technologies proliferate and as the language resources community expands, it is becoming increasingly difficult to locate and reuse existing resources. Are there any lexical resources for such-and-such a language? What tool works with transcripts in this particular format? What is a good format to use for linguistic data of this type? Questions like these dominat...
Journal title: CoRR
Volume: cs.CL/0105030
Publication date: 2001